Abstract: The kernel method is a powerful tool in machine learning. The kernel trick has been applied effectively and extensively in many areas of machine learning, such as the support vector machine (SVM) and kernel principal component analysis (kernel PCA). The kernel trick defines a kernel function that returns the inner-product of data in the feature space without requiring the feature space data explicitly. In this paper, the kernel trick is employed to extend spectrum sensing with the leading eigenvector under the framework of PCA to a higher dimensional feature space. Namely, the leading eigenvector of the sample covariance matrix in the feature space is used for spectrum sensing without knowing the leading eigenvector explicitly. Spectrum sensing with the leading eigenvector under the framework of kernel PCA is proposed with the inner-product as a measure of similarity. A modified kernel GLRT algorithm based on the matched subspace model is applied to spectrum sensing for the first time. The experimental results on a simulated sinusoidal signal show that spectrum sensing with kernel PCA is about 4 dB better than PCA, and that kernel GLRT is also better than GLRT. The proposed algorithms are also tested on a measured DTV signal. The results show that kernel methods are 4 dB better than the corresponding linear methods. The leading eigenvector of the sample covariance matrix learned by kernel PCA is more stable than that learned by PCA for different segments of the DTV signal.

Index Terms: Kernel, spectrum sensing, support vector machine (SVM), kernel principal component analysis (kernel PCA), kernel generalized likelihood ratio test (kernel GLRT).



I. Introduction 

Spectrum sensing is a cornerstone of cognitive radio [1], [2]. It detects the availability of radio frequency bands for possible use by a secondary user without interference to the primary user. Traditional techniques proposed for spectrum sensing include energy detection, matched filter detection, cyclostationary feature detection, covariance-based detection, and feature based detection [3]-[11]. The spectrum sensing problem is essentially a detection problem.

The secondary user receives the signal $y(t)$. Based on the received signal, there are two hypotheses: the primary user is present, $\mathcal{H}_1$, or the primary user is absent, $\mathcal{H}_0$. In practice, spectrum sensing involves detecting whether the primary user is present or not from discrete samples of $y(t)$,

$$
y(n) = \begin{cases} w(n), & \mathcal{H}_0 \\ x(n) + w(n), & \mathcal{H}_1 \end{cases} \tag{1}
$$

in which $x(n)$ are samples of the primary user's signal and $w(n)$ are samples of zero mean white Gaussian noise. In general, spectrum sensing algorithms aim at maximizing the detection rate at a fixed false alarm rate with low computational complexity. The detection rate $P_d$ and false alarm rate $P_f$ are defined as

$$
\begin{aligned}
P_d &= \mathrm{prob}\left(\text{detect } \mathcal{H}_1 \mid y(n) = x(n) + w(n)\right) \\
P_f &= \mathrm{prob}\left(\text{detect } \mathcal{H}_1 \mid y(n) = w(n)\right)
\end{aligned} \tag{2}
$$

in which prob represents probability.

Kernel methods [12]-[15] have been extensively and successfully applied in machine learning, especially in the support vector machine (SVM) [16], [17]. Kernel methods are counterparts of linear methods that are implemented in a feature space. The data in the original space can be mapped to different feature spaces with different kernel functions. The diversity of feature spaces gives more freedom to find an algorithm that performs better than one restricted to the original space.

A kernel function which relies only on the inner-product of feature space data is defined as [18]

$$
k(\mathbf{x}_i, \mathbf{x}_j) = \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_j) \rangle \tag{3}
$$

to implicitly map the original space data $\mathbf{x}$ into a higher dimensional feature space $F$, where $\varphi$ is the mapping from the original space to the feature space. The dimension of $\varphi(\mathbf{x})$ can be infinite, as with the Gaussian kernel. Thus direct operation on $\varphi(\mathbf{x})$ may be computationally infeasible. However, with the use of the kernel function, the computation relies only on the inner-product between the data points. Thus the extension of some algorithms to a feature space of arbitrary dimension becomes possible.

$\langle \mathbf{x}_i, \mathbf{x}_j \rangle$ is the inner-product between $\mathbf{x}_i$ and $\mathbf{x}_j$. A function $k$ is a valid kernel if there exists a mapping $\varphi$ satisfying Eq. (3). Mercer's condition [18] specifies which functions are valid kernels. Kernel functions allow a linear method to generalize to a non-linear method without knowing $\varphi$ explicitly. If the data in the original space embodies nonlinear structure, kernel methods can usually obtain better performance than linear methods.
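As a small illustration of Eq. (3), the sketch below checks numerically that a second-order polynomial kernel equals the inner-product of an explicit feature map. The function names `poly2_kernel` and `poly2_feature_map` are hypothetical and chosen only for this example; the explicit map is one possible choice of $\varphi$, not the only one.

```python
import numpy as np


def poly2_kernel(x, z, c=1.0):
    """Second-order polynomial kernel k(x, z) = (<x, z> + c)^2."""
    return (np.dot(x, z) + c) ** 2


def poly2_feature_map(x, c=1.0):
    """One explicit map phi with <phi(x), phi(z)> = (<x, z> + c)^2.

    Illustrative only: the kernel trick never needs this vector.
    """
    x = np.asarray(x, dtype=float)
    squares = x ** 2                          # x_i^2 terms
    i, j = np.triu_indices(x.size, k=1)
    cross = np.sqrt(2.0) * x[i] * x[j]        # sqrt(2) x_i x_j for i < j
    linear = np.sqrt(2.0 * c) * x             # sqrt(2c) x_i terms
    return np.concatenate([squares, cross, linear, [c]])


x = np.array([1.0, 2.0, 3.0])
z = np.array([0.5, -1.0, 2.0])
lhs = poly2_kernel(x, z)                                    # kernel evaluation
rhs = np.dot(poly2_feature_map(x), poly2_feature_map(z))    # explicit inner-product
print(lhs, rhs)
```

For a $d$-dimensional input the explicit map already has $d(d+3)/2 + 1$ entries, whereas the kernel evaluation costs one $d$-dimensional inner-product; for the Gaussian kernel no finite explicit map exists at all.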

Spectrum sensing with the leading eigenvector of the sample covariance matrix has been proposed and successfully demonstrated in hardware in [11] under the framework of PCA. The leading eigenvector of a non-white wide-sense stationary (WSS) signal has been proved stable. In this paper, spectrum sensing with the leading eigenvector of the sample covariance matrix of feature space data is proposed. The kernel trick is employed to implicitly map the original space data to a higher dimensional feature space. In the feature space, the inner-product is taken as a measure of similarity between leading eigenvectors without knowing the leading eigenvectors explicitly. That is to say, spectrum sensing with the leading eigenvector under the framework of kernel PCA is proposed with the inner-product as a measure of similarity.

Several generalized likelihood ratio test (GLRT) [19], [20] algorithms have been proposed for spectrum sensing. A kernel GLRT algorithm [21] based on the matched subspace model [22] has been proposed and applied to the hyperspectral target detection problem. It assumes that the target and background lie in the known linear subspaces [T] and [B]. T and B are orthonormal matrices whose columns span the subspaces [T] and [B], respectively. T and B consist of eigenvectors corresponding to nonzero eigenvalues of the sample covariance matrices of target and background, respectively. The identity projection operator in the feature space is assumed to map $\varphi(\mathbf{x})$ onto the subspace consisting of the linear combinations of the column vectors of T and B.

In this paper, a modified kernel GLRT algorithm based on the matched subspace model is employed for spectrum sensing for the first time, without consideration of background. In addition, the identity projection operator in the feature space is assumed to map $\varphi(\mathbf{x})$ to $\varphi(\mathbf{x})$ itself in this paper.

The contributions of this paper are as follows. The detection algorithm with the leading eigenvector is generalized to feature spaces which are determined by the choice of kernel function. Simply speaking, leading eigenvector detection based on kernel PCA is proposed for spectrum sensing. Different from PCA, the similarity of leading eigenvectors is measured by the inner-product instead of the maximum absolute value of the cross-correlation. A modified version of kernel GLRT is introduced to spectrum sensing, which assumes a perfect identity projection operator in the feature space and does not involve a background signal. A DTV signal [23] captured in Washington D.C. is employed to test the proposed kernel PCA and kernel GLRT algorithms for spectrum sensing.

The organization of this paper is as follows. In Section II, spectrum sensing with the leading eigenvector under the framework of PCA is reviewed. Detection with the leading eigenvector is then extended to the feature space by use of a kernel, and the proposed algorithm, spectrum sensing with the leading eigenvector under the framework of kernel PCA, is introduced in the same section. GLRT and modified kernel GLRT algorithms for spectrum sensing based on the matched subspace model are introduced in Section III. The experimental results on a simulated sinusoidal signal and a DTV signal are shown in Section IV, where the kernel methods are compared with the corresponding linear methods. Finally, the paper is concluded in Section V.

II. Spectrum Sensing with PCA and Kernel PCA 

The $d$-dimensional received vector is $\mathbf{y} = (y(n), y(n+1), \ldots, y(n+d-1))^T$, therefore,

$$
\begin{aligned}
\mathcal{H}_0 &: \mathbf{y} = \mathbf{w} \\
\mathcal{H}_1 &: \mathbf{y} = \mathbf{x} + \mathbf{w}
\end{aligned} \tag{4}
$$

in which $\mathbf{x} = (x(n), x(n+1), \ldots, x(n+d-1))^T$ and $\mathbf{w} = (w(n), w(n+1), \ldots, w(n+d-1))^T$. Assume that the samples of the primary user's signal, $x(n), x(n+1), \ldots, x(L-1)$, are known a priori with length $L > d$. The training set consists of

$$
\begin{aligned}
\mathbf{x}_1 &= (x(n), x(n+1), \ldots, x(n+d-1))^T, \\
\mathbf{x}_2 &= (x(n+i), x(n+i+1), \ldots, x(n+i+d-1))^T, \\
&\;\;\vdots \\
\mathbf{x}_M &= (x(n+(M-1)i), x(n+(M-1)i+1), \ldots, x(n+(M-1)i+d-1))^T,
\end{aligned} \tag{5}
$$

where $M$ is the number of vectors in the training set and $i$ is the sampling interval. $T$ represents transpose.
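The construction in (5) can be sketched as a sliding window over the sample sequence. The helper name `build_vectors` is hypothetical, and taking every window that fits into the sequence is an assumption of this sketch; the paper does not state how $M$ is chosen.

```python
import numpy as np


def build_vectors(samples, d, step=1):
    """Stack d-dimensional windows of `samples` as rows, as in Eq. (5).

    `step` plays the role of the sampling interval i. Using all windows
    that fit is an assumption made for this sketch.
    """
    samples = np.asarray(samples, dtype=float)
    M = (samples.size - d) // step + 1
    if M < 1:
        raise ValueError("sequence is shorter than the window length d")
    starts = step * np.arange(M)
    return np.stack([samples[s:s + d] for s in starts])


# Toy sequence: 10 samples, window length 4, sampling interval 2.
X = build_vectors(np.arange(10), d=4, step=2)
print(X.shape)   # one row per training vector
print(X[-1])     # the last window
```

Each row of the returned array is one training vector; the same helper can be applied to the received samples to obtain the received vectors.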

A. Detection Algorithm with Leading Eigenvector under the 
Framework of PCA 

The leading eigenvector (the eigenvector corresponding to the largest eigenvalue) of the sample covariance matrix of the training set is taken as the template of the PCA method. Given the $d$-dimensional column vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M$ of the training set, the sample covariance matrix can be obtained by

$$
\mathbf{R}_x = \frac{1}{M} \sum_{i=1}^{M} \mathbf{x}_i \mathbf{x}_i^T \tag{6}
$$

which assumes that the sample mean is zero,

$$
\frac{1}{M} \sum_{i=1}^{M} \mathbf{x}_i = \mathbf{0}. \tag{7}
$$

The leading eigenvector of $\mathbf{R}_x$ can be extracted by eigen-decomposition of $\mathbf{R}_x$,

$$
\mathbf{R}_x = \mathbf{V} \boldsymbol{\Lambda} \mathbf{V}^T \tag{8}
$$

where $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_d)$ is a diagonal matrix. $\lambda_i, i = 1, 2, \ldots, d$ are the eigenvalues of $\mathbf{R}_x$. $\mathbf{V}$ is an orthonormal matrix, the columns of which, $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_d$, are the eigenvectors corresponding to the eigenvalues $\lambda_i, i = 1, 2, \ldots, d$. For simplicity, take $\mathbf{v}_1$ as the eigenvector corresponding to the largest eigenvalue. The leading eigenvector $\mathbf{v}_1$ is the template of PCA.

For the received samples $(y(n), y(n+1), \ldots, y(L-1))$, likewise, vectors $\mathbf{y}_i, i = 1, 2, \ldots, M$ can be obtained by (5). (Indeed, the number of vectors in the training set is not necessarily equal to the number of received vectors; here, for simplicity, the same $M$ denotes both.) The leading eigenvector $\tilde{\mathbf{v}}_1$ of the sample covariance matrix $\mathbf{R}_y = \frac{1}{M} \sum_{i=1}^{M} \mathbf{y}_i \mathbf{y}_i^T$ is obtained. The presence of $x(n)$ in $y(n)$ is determined by

$$
\rho = \max_{l = 0, 1, \ldots, d} \left| \sum_{k=1}^{d} \mathbf{v}_1[k] \, \tilde{\mathbf{v}}_1[k+l] \right| > T_{pca}, \tag{9}
$$

where $T_{pca}$ is the threshold value for the PCA method, and $\rho$ is the similarity between $\tilde{\mathbf{v}}_1$ and the template $\mathbf{v}_1$, which is measured by cross-correlation. $T_{pca}$ is assigned to achieve a desired false alarm rate. Detection with the leading eigenvector under the framework of PCA is simply called PCA detection.
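A minimal sketch of PCA detection is given below, under stated assumptions: vectors are stored as rows, the sample mean is taken as zero as in (7), and the cross-correlation is evaluated at non-negative lags with zero padding. The lag convention and the function names `leading_eigenvector` and `pca_similarity` are assumptions of this sketch, not details confirmed by the text.

```python
import numpy as np


def leading_eigenvector(V):
    """Leading eigenvector of R = (1/M) sum_i v_i v_i^T (rows of V are v_i)."""
    R = V.T @ V / V.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)   # eigenvalues in ascending order
    return eigvecs[:, -1]                  # unit-norm eigenvector of the largest one


def pca_similarity(template, candidate):
    """Maximum absolute cross-correlation over non-negative lags.

    Assumption: entries shifted past the end of the vector count as zero.
    """
    d = template.size
    # np.correlate(a, v, "full")[d - 1 + l] = sum_k a[k + l] * v[k]
    corr = np.correlate(candidate, template, mode="full")[d - 1:]
    return float(np.max(np.abs(corr)))


# Toy data: windows of a two-tone signal serve as training and received sets.
t = np.arange(60)
s = np.sin(0.4 * t) + np.sin(1.1 * t)
X = np.stack([s[k:k + 8] for k in range(53)])
v1 = leading_eigenvector(X)
rho = pca_similarity(v1, leading_eigenvector(X))
print(rho)
```

Taking the absolute value makes the statistic insensitive to the arbitrary sign of an eigenvector returned by the eigen-decomposition.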
B. Detection Algorithm with Leading Eigenvector under the
Framework of Kernel PCA

A nonlinear version of PCA, kernel PCA [24], has been proposed based on the classical PCA approach. A kernel function is employed by kernel PCA to implicitly map the data into a higher dimensional feature space, in which PCA is assumed to work better than in the original space. By introducing the kernel function, the mapping $\varphi$ need not be known explicitly, so better performance can be obtained without much increase in computational complexity.

The training set $\mathbf{x}_i, i = 1, 2, \ldots, M$ and the received set $\mathbf{y}_i, i = 1, 2, \ldots, M$ in kernel PCA are obtained in the same way as in the PCA framework.

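The training set in the feature space is $\varphi(\mathbf{x}_1), \varphi(\mathbf{x}_2), \ldots, \varphi(\mathbf{x}_M)$, which is assumed to have zero mean, i.e., $\frac{1}{M} \sum_{i=1}^{M} \varphi(\mathbf{x}_i) = \mathbf{0}$. Similarly, the sample covariance matrix of $\varphi(\mathbf{x}_i)$ is

$$
\mathbf{R}_{\varphi(x)} = \frac{1}{M} \sum_{i=1}^{M} \varphi(\mathbf{x}_i) \varphi(\mathbf{x}_i)^T. \tag{10}
$$

The leading eigenvector $\mathbf{v}_1^f$ of $\mathbf{R}_{\varphi(x)}$ corresponding to the largest eigenvalue $\lambda_1^f$ satisfies

$$
\begin{aligned}
\mathbf{R}_{\varphi(x)} \mathbf{v}_1^f &= \lambda_1^f \mathbf{v}_1^f \\
\frac{1}{M} \sum_{i=1}^{M} \varphi(\mathbf{x}_i) \varphi(\mathbf{x}_i)^T \mathbf{v}_1^f &= \lambda_1^f \mathbf{v}_1^f \\
\frac{1}{M} \sum_{i=1}^{M} \langle \varphi(\mathbf{x}_i), \mathbf{v}_1^f \rangle \varphi(\mathbf{x}_i) &= \lambda_1^f \mathbf{v}_1^f.
\end{aligned} \tag{11}
$$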

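The last equation in (11) implies that the eigenvector $\mathbf{v}_1^f$ is a linear combination of the feature space data $\varphi(\mathbf{x}_1), \varphi(\mathbf{x}_2), \ldots, \varphi(\mathbf{x}_M)$,

$$
\mathbf{v}_1^f = \sum_{i=1}^{M} \beta_i \varphi(\mathbf{x}_i). \tag{12}
$$

Substituting (12) into (11),

$$
\frac{1}{M} \sum_{i=1}^{M} \varphi(\mathbf{x}_i) \varphi(\mathbf{x}_i)^T \sum_{j=1}^{M} \beta_j \varphi(\mathbf{x}_j) = \lambda_1^f \sum_{j=1}^{M} \beta_j \varphi(\mathbf{x}_j) \tag{13}
$$

and left multiplying both sides of (13) by $\varphi(\mathbf{x}_t)^T, t = 1, 2, \ldots, M$, yields

$$
\frac{1}{M} \sum_{i=1}^{M} \langle \varphi(\mathbf{x}_t), \varphi(\mathbf{x}_i) \rangle \sum_{j=1}^{M} \beta_j \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_j) \rangle = \lambda_1^f \sum_{j=1}^{M} \beta_j \langle \varphi(\mathbf{x}_t), \varphi(\mathbf{x}_j) \rangle. \tag{14}
$$

By introducing the kernel matrix $\mathbf{K} = (k(\mathbf{x}_i, \mathbf{x}_j))_{ij} = (\langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_j) \rangle)_{ij}$ and the vector $\boldsymbol{\beta}_1 = (\beta_1, \beta_2, \ldots, \beta_M)^T$, Eq. (14) becomes

$$
\mathbf{K}^2 \boldsymbol{\beta}_1 = M \lambda_1^f \mathbf{K} \boldsymbol{\beta}_1 \;\Rightarrow\; \mathbf{K} \boldsymbol{\beta}_1 = M \lambda_1^f \boldsymbol{\beta}_1. \tag{15}
$$

It can be seen that $\boldsymbol{\beta}_1$ is the leading eigenvector of the kernel matrix $\mathbf{K}$. The kernel matrix $\mathbf{K}$ is positive semidefinite.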

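Thus, the coefficients $\beta_i$ in (12) for $\mathbf{v}_1^f$ can be obtained by eigen-decomposition of the kernel matrix $\mathbf{K}$, which has been proved in [24]. The normalization of $\mathbf{v}_1^f$ can be derived by [24]

$$
\begin{aligned}
1 &= \langle \mathbf{v}_1^f, \mathbf{v}_1^f \rangle \\
&= \Big\langle \sum_{i=1}^{M} \beta_i \varphi(\mathbf{x}_i), \sum_{j=1}^{M} \beta_j \varphi(\mathbf{x}_j) \Big\rangle \\
&= \sum_{i=1}^{M} \sum_{j=1}^{M} \beta_i \beta_j \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_j) \rangle \\
&= \boldsymbol{\beta}_1^T \mathbf{K} \boldsymbol{\beta}_1 \\
&= \boldsymbol{\beta}_1^T \mu_1 \boldsymbol{\beta}_1 \\
&= \mu_1 \langle \boldsymbol{\beta}_1, \boldsymbol{\beta}_1 \rangle
\end{aligned} \tag{16}
$$

in which $\mu_1$ is the eigenvalue corresponding to the eigenvector $\boldsymbol{\beta}_1$ of $\mathbf{K}$.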
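The computation in (15) and (16) can be sketched as follows: the coefficient vector is the leading eigenvector of the kernel matrix, rescaled so that the implicit feature-space eigenvector has unit norm. The function name `leading_kernel_coefficients` is hypothetical, and the toy kernel matrix is built from a linear kernel purely for illustration.

```python
import numpy as np


def leading_kernel_coefficients(K):
    """Return (beta_1, mu_1) with beta_1^T K beta_1 = 1, following Eq. (16).

    np.linalg.eigh returns a unit-norm eigenvector, so dividing by
    sqrt(mu_1) gives mu_1 * <beta_1, beta_1> = 1.
    """
    mu, B = np.linalg.eigh(K)      # eigenvalues in ascending order
    mu_1 = mu[-1]
    beta_1 = B[:, -1] / np.sqrt(mu_1)
    return beta_1, mu_1


# Toy kernel matrix from a linear kernel on three 2-dimensional points.
X = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
K = X @ X.T
beta_1, mu_1 = leading_kernel_coefficients(K)
print(beta_1 @ K @ beta_1)   # squared norm of the implicit eigenvector
```

The rescaling only requires the largest eigenvalue of the kernel matrix to be positive, which holds whenever the kernel matrix is not identically zero.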

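In the traditional kernel PCA approach [24], the first principal component of a random point $\varphi(\mathbf{x})$ in the feature space can be extracted by

$$
\langle \varphi(\mathbf{x}), \mathbf{v}_1^f \rangle = \sum_{i=1}^{M} \beta_i \langle \varphi(\mathbf{x}), \varphi(\mathbf{x}_i) \rangle = \sum_{i=1}^{M} \beta_i k(\mathbf{x}, \mathbf{x}_i), \tag{17}
$$

without knowing $\mathbf{v}_1^f$ explicitly.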

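However, instead of computing principal components in the feature space, the leading eigenvector $\mathbf{v}_1^f$ is needed as the template for the detection problem. Though $\mathbf{v}_1^f$ can be written as a linear combination of $\varphi(\mathbf{x}_1), \varphi(\mathbf{x}_2), \ldots, \varphi(\mathbf{x}_M)$ in which the coefficients are the entries of the leading eigenvector of $\mathbf{K}$, the leading eigenvector $\mathbf{v}_1^f$ is still not explicitly known because $\varphi(\mathbf{x}_1), \varphi(\mathbf{x}_2), \ldots, \varphi(\mathbf{x}_M)$ are not given.

In this paper, a detection scheme based on the leading eigenvector of the sample covariance matrix in the feature space is proposed without knowing $\mathbf{v}_1^f$ explicitly.

Given the received vectors $\mathbf{y}_i, i = 1, 2, \ldots, M$, likewise, the leading eigenvector $\tilde{\mathbf{v}}_1^f$ of the sample covariance matrix $\mathbf{R}_{\varphi(y)}$ is a linear combination of the feature space data $\varphi(\mathbf{y}_1), \varphi(\mathbf{y}_2), \ldots, \varphi(\mathbf{y}_M)$, i.e.,

$$
\tilde{\mathbf{v}}_1^f = \sum_{i=1}^{M} \tilde{\beta}_i \varphi(\mathbf{y}_i). \tag{18}
$$

$\tilde{\boldsymbol{\beta}}_1 = (\tilde{\beta}_1, \tilde{\beta}_2, \ldots, \tilde{\beta}_M)^T$ is the leading eigenvector of the kernel matrix

$$
\tilde{\mathbf{K}} = (k(\mathbf{y}_i, \mathbf{y}_j))_{ij} = (\langle \varphi(\mathbf{y}_i), \varphi(\mathbf{y}_j) \rangle)_{ij}. \tag{19}
$$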



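It is well known that the inner-product is one kind of similarity measure. Here, the similarity between $\mathbf{v}_1^f$ and $\tilde{\mathbf{v}}_1^f$ is measured by the inner-product,

$$
\begin{aligned}
\rho &= \langle \mathbf{v}_1^f, \tilde{\mathbf{v}}_1^f \rangle \\
&= \Big\langle \sum_{i=1}^{M} \beta_i \varphi(\mathbf{x}_i), \sum_{j=1}^{M} \tilde{\beta}_j \varphi(\mathbf{y}_j) \Big\rangle \\
&= \{(\varphi(\mathbf{x}_1), \varphi(\mathbf{x}_2), \ldots, \varphi(\mathbf{x}_M)) \boldsymbol{\beta}_1\}^T \{(\varphi(\mathbf{y}_1), \varphi(\mathbf{y}_2), \ldots, \varphi(\mathbf{y}_M)) \tilde{\boldsymbol{\beta}}_1\} \\
&= \boldsymbol{\beta}_1^T \begin{pmatrix} \varphi(\mathbf{x}_1)^T \\ \vdots \\ \varphi(\mathbf{x}_M)^T \end{pmatrix} (\varphi(\mathbf{y}_1), \varphi(\mathbf{y}_2), \ldots, \varphi(\mathbf{y}_M)) \tilde{\boldsymbol{\beta}}_1 \\
&= \boldsymbol{\beta}_1^T \begin{pmatrix} k(\mathbf{x}_1, \mathbf{y}_1) & k(\mathbf{x}_1, \mathbf{y}_2) & \cdots & k(\mathbf{x}_1, \mathbf{y}_M) \\ k(\mathbf{x}_2, \mathbf{y}_1) & k(\mathbf{x}_2, \mathbf{y}_2) & \cdots & k(\mathbf{x}_2, \mathbf{y}_M) \\ \vdots & \vdots & \ddots & \vdots \\ k(\mathbf{x}_M, \mathbf{y}_1) & k(\mathbf{x}_M, \mathbf{y}_2) & \cdots & k(\mathbf{x}_M, \mathbf{y}_M) \end{pmatrix} \tilde{\boldsymbol{\beta}}_1 \\
&= \boldsymbol{\beta}_1^T \mathbf{K}^t \tilde{\boldsymbol{\beta}}_1.
\end{aligned} \tag{20}
$$

$\mathbf{K}^t$ is the kernel matrix between $\varphi(\mathbf{x}_i)$ and $\varphi(\mathbf{y}_j)$. Based on (20), a measure of similarity between $\mathbf{v}_1^f$ and $\tilde{\mathbf{v}}_1^f$ has been obtained without giving $\mathbf{v}_1^f$ and $\tilde{\mathbf{v}}_1^f$ explicitly.

Fig. 1. The flow chart of the proposed kernel PCA algorithm for spectrum sensing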



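The proposed detection algorithm with the leading eigenvector under the framework of kernel PCA is summarized as follows:

1) Choose a kernel function $k$. Given the training set of the primary user's signal $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M$, the kernel matrix is $\mathbf{K} = (k(\mathbf{x}_i, \mathbf{x}_j))_{ij}$. $\mathbf{K}$ is positive semidefinite. Perform eigen-decomposition of $\mathbf{K}$ to obtain the leading eigenvector $\boldsymbol{\beta}_1$.

2) The received vectors are $\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M$. Based on the chosen kernel function, the kernel matrix $\tilde{\mathbf{K}} = (k(\mathbf{y}_i, \mathbf{y}_j))_{ij}$ is obtained. The leading eigenvector $\tilde{\boldsymbol{\beta}}_1$ is also obtained by eigen-decomposition of $\tilde{\mathbf{K}}$.

3) The leading eigenvectors of $\mathbf{R}_{\varphi(x)}$ and $\mathbf{R}_{\varphi(y)}$ can be expressed as

$$
\begin{aligned}
\mathbf{v}_1^f &= (\varphi(\mathbf{x}_1), \varphi(\mathbf{x}_2), \ldots, \varphi(\mathbf{x}_M)) \boldsymbol{\beta}_1, \\
\tilde{\mathbf{v}}_1^f &= (\varphi(\mathbf{y}_1), \varphi(\mathbf{y}_2), \ldots, \varphi(\mathbf{y}_M)) \tilde{\boldsymbol{\beta}}_1.
\end{aligned} \tag{21}
$$

4) Normalize $\mathbf{v}_1^f$ and $\tilde{\mathbf{v}}_1^f$ by (16).

5) The similarity between $\mathbf{v}_1^f$ and $\tilde{\mathbf{v}}_1^f$ is

$$
\rho = \boldsymbol{\beta}_1^T \mathbf{K}^t \tilde{\boldsymbol{\beta}}_1. \tag{22}
$$

6) Determine the presence or absence of the primary signal $x(n)$ in $y(n)$ by evaluating whether $\rho > T_{kpca}$ or not.

$T_{kpca}$ is the threshold value for the kernel PCA algorithm. The flow chart of the proposed kernel PCA algorithm for spectrum sensing is shown in Fig. 1. Detection with the leading eigenvector under the framework of kernel PCA is simply called kernel PCA detection. The templates of PCA can be learned blindly even at very low signal to noise ratio (SNR) [25].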
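Steps 1) to 5) above can be sketched as below, assuming vectors are stored as rows and using the second-order polynomial kernel with $c = 1$. The function names and the sign convention for the eigenvectors are assumptions of this sketch; the text does not say how the sign ambiguity of an eigenvector is resolved.

```python
import numpy as np


def poly_kernel(A, B, c=1.0, degree=2):
    """Polynomial kernel matrix between the rows of A and the rows of B."""
    return (A @ B.T + c) ** degree


def leading_coefficients(K):
    """Leading eigenvector of K, scaled so that beta^T K beta = 1 (Eq. (16))."""
    mu, B = np.linalg.eigh(K)
    beta = B[:, -1] / np.sqrt(mu[-1])
    # Assumed convention: make the entries sum to a non-negative value,
    # since an eigenvector is only defined up to sign.
    return -beta if beta.sum() < 0 else beta


def kernel_pca_similarity(X, Y, kernel=poly_kernel):
    """rho = beta_1^T K^t beta_tilde_1, the statistic of Eq. (22)."""
    beta = leading_coefficients(kernel(X, X))         # training set
    beta_tilde = leading_coefficients(kernel(Y, Y))   # received set
    return float(beta @ kernel(X, Y) @ beta_tilde)


# Toy data: windows of one sinusoid versus windows of another.
t = np.arange(40)
X = np.stack([np.sin(0.3 * t)[k:k + 8] for k in range(33)])
Y = np.stack([np.cos(1.1 * t)[k:k + 8] for k in range(33)])
print(kernel_pca_similarity(X, X), kernel_pca_similarity(X, Y))
```

Because both implicit eigenvectors are normalized, the statistic is bounded by one in absolute value and equals one when the received set coincides with the training set.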

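So far the mean of $\varphi(\mathbf{x}_i), i = 1, 2, \ldots, M$ has been assumed to be zero. In fact, the zero mean data in the feature space are

$$
\tilde{\varphi}(\mathbf{x}_i) = \varphi(\mathbf{x}_i) - \frac{1}{M} \sum_{j=1}^{M} \varphi(\mathbf{x}_j). \tag{23}
$$

The kernel matrix for these centered (zero mean) data can be derived by [24]

$$
\mathbf{K}_c = \mathbf{K} - \mathbf{1}_M \mathbf{K} - \mathbf{K} \mathbf{1}_M + \mathbf{1}_M \mathbf{K} \mathbf{1}_M \tag{24}
$$

in which $(\mathbf{1}_M)_{ij} := 1/M$. The centering in the feature space is not done in this paper.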
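Although centering is not used in the paper, Eq. (24) is easy to check numerically. The sketch below, with the hypothetical helper `center_kernel_matrix`, compares the centered kernel matrix of a linear kernel with the Gram matrix of explicitly centered data; the linear kernel is chosen only because its feature map is available explicitly.

```python
import numpy as np


def center_kernel_matrix(K):
    """K_c = K - 1_M K - K 1_M + 1_M K 1_M with (1_M)_ij = 1/M, as in Eq. (24)."""
    M = K.shape[0]
    one_M = np.full((M, M), 1.0 / M)
    return K - one_M @ K - K @ one_M + one_M @ K @ one_M


# Linear kernel: the feature map is the identity, so centering can be
# carried out explicitly on the data and compared with Eq. (24).
X = np.array([[1.0, 2.0], [3.0, 5.0], [0.0, 1.0]])
K = X @ X.T
Xc = X - X.mean(axis=0)
print(np.allclose(center_kernel_matrix(K), Xc @ Xc.T))
```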

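Some commonly used kernels are as follows: polynomial kernels

$$
k(\mathbf{x}_i, \mathbf{x}_j) = (\langle \mathbf{x}_i, \mathbf{x}_j \rangle + c)^{de}, \quad c \geq 0, \tag{25}
$$

where $de$ is the order of the polynomial, radial basis function (RBF) kernels

$$
k(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2), \tag{26}
$$

and neural network type kernels

$$
k(\mathbf{x}_i, \mathbf{x}_j) = \tanh(\langle \mathbf{x}_i, \mathbf{x}_j \rangle + b), \tag{27}
$$

in which the heavy-tailed RBF kernel is in the form of

$$
k(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \|\mathbf{x}_i^a - \mathbf{x}_j^a\|^b), \tag{28}
$$

and the Gaussian RBF kernel is

$$
k(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{2\sigma^2}\right). \tag{29}
$$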
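The kernels in (25), (26), (27), and (29) can be evaluated for whole sets of vectors at once. The following is a minimal sketch with vectors stored as rows; the function names and default parameter values are assumptions made for illustration.

```python
import numpy as np


def _squared_distances(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(sq, 0.0)   # clip tiny negative values caused by rounding


def polynomial_kernel(A, B, c=1.0, degree=2):
    """Eq. (25): (<x_i, x_j> + c)^degree."""
    return (A @ B.T + c) ** degree


def rbf_kernel(A, B, gamma=1.0):
    """Eq. (26): exp(-gamma ||x_i - x_j||^2)."""
    return np.exp(-gamma * _squared_distances(A, B))


def tanh_kernel(A, B, b=0.0):
    """Eq. (27): tanh(<x_i, x_j> + b)."""
    return np.tanh(A @ B.T + b)


def gaussian_kernel(A, B, sigma=1.0):
    """Eq. (29): exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    return rbf_kernel(A, B, gamma=1.0 / (2.0 * sigma ** 2))


X = np.array([[0.0, 0.0], [1.0, 0.0]])
print(gaussian_kernel(X, X))
print(polynomial_kernel(X, X))
```

The Gaussian kernel is the RBF kernel with $\gamma = 1/(2\sigma^2)$, and it satisfies $k(\mathbf{x}, \mathbf{x}) = 1$ for every $\mathbf{x}$.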
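III. Spectrum Sensing with GLRT and Kernel GLRT

GLRT and kernel GLRT methods considered in this paper also assume that there is a training set $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M$ for the primary user's signal, in which $\mathbf{x}_i, i = 1, 2, \ldots, M$ are $d$-dimensional column vectors. The primary user's signal is assumed to lie on a given linear subspace [T]. The training set is used to estimate this subspace [T].

Given the training set $\mathbf{x}_i, i = 1, 2, \ldots, M$, the sample covariance matrix $\mathbf{R}_x$ is obtained by (6). The eigenvectors of $\mathbf{R}_x$ corresponding to nonzero eigenvalues are taken as the bases of the subspace [T].

Kernel GLRT [21] based on the matched subspace model has been proposed for hyperspectral target detection, and it takes the background into account. The background information can be regarded as interference in spectrum sensing. In this paper, the modified kernel GLRT algorithm based on the matched subspace model is proposed for spectrum sensing without taking the interference into consideration.

A. GLRT Based on Matched Subspace Model

The GLRT approach in this paper is based on the linear subspace model [22] in which the primary user's signal is assumed to lie on a linear subspace [T]. Receiving one $d$-dimensional vector $\mathbf{y}$, the two hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$ can be expressed as

$$
\begin{aligned}
\mathcal{H}_0 &: \mathbf{y} = \mathbf{w} \\
\mathcal{H}_1 &: \mathbf{y} = \mathbf{T}\boldsymbol{\theta} + \mathbf{w}.
\end{aligned} \tag{30}
$$

[T] is spanned by the column vectors of $\mathbf{T}$. $\mathbf{T}$ is an orthonormal matrix, $\mathbf{T}^T \mathbf{T} = \mathbf{I}$, in which $\mathbf{I}$ is an identity matrix. $\boldsymbol{\theta}$ is the coefficient vector in which each entry represents the magnitude on each basis of [T]. $\mathbf{w}$ is still a white Gaussian noise vector which obeys the multivariate Gaussian distribution $N(\mathbf{0}, \sigma^2 \mathbf{I})$.

For the received vector $\mathbf{y}$, the LRT approach decides between the two hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$ by

$$
\rho = \frac{f_1(\mathbf{y} \mid \mathcal{H}_1)}{f_0(\mathbf{y} \mid \mathcal{H}_0)} \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} T_{lrt} \tag{31}
$$

in which $T_{lrt}$ is the threshold value of the LRT approach. $f_1(\mathbf{y} \mid \mathcal{H}_1)$ and $f_0(\mathbf{y} \mid \mathcal{H}_0)$ are conditional probability densities which follow Gaussian distributions,

$$
\begin{aligned}
\mathcal{H}_0 &: f_0(\mathbf{y} \mid \mathcal{H}_0) : N(\mathbf{0}, \sigma_0^2 \mathbf{I}) = \frac{1}{(2\pi\sigma_0^2)^{d/2}} \exp\left(-\frac{1}{2\sigma_0^2} \|\mathbf{w}_0\|^2\right), \\
\mathcal{H}_1 &: f_1(\mathbf{y} \mid \mathcal{H}_1) : N(\mathbf{T}\boldsymbol{\theta}, \sigma_1^2 \mathbf{I}) = \frac{1}{(2\pi\sigma_1^2)^{d/2}} \exp\left(-\frac{1}{2\sigma_1^2} \|\mathbf{w}_1\|^2\right).
\end{aligned} \tag{32}
$$

In general, the parameters $\boldsymbol{\theta}, \sigma_0, \sigma_1$ are unknown, which is the setting the GLRT approach addresses. In GLRT, the parameters $\boldsymbol{\theta}, \sigma_0, \sigma_1$ are replaced by their maximum likelihood estimates $\hat{\boldsymbol{\theta}}, \hat{\sigma}_0, \hat{\sigma}_1$. The maximum likelihood estimate of $\boldsymbol{\theta}$ is equivalent to the least square estimate of $\mathbf{w}_1$ [21],

$$
\begin{aligned}
\hat{\mathbf{w}}_0 &= \mathbf{y} \\
\hat{\mathbf{w}}_1 &= \mathbf{y} - \mathbf{T}\hat{\boldsymbol{\theta}} = (\mathbf{I} - \mathbf{P}_T)\mathbf{y}.
\end{aligned} \tag{33}
$$

The maximum likelihood estimates of $\sigma_0, \sigma_1$ can be cast as

$$
\begin{aligned}
\hat{\sigma}_0^2 &= \frac{1}{d} \|\hat{\mathbf{w}}_0\|^2 \\
\hat{\sigma}_1^2 &= \frac{1}{d} \|\hat{\mathbf{w}}_1\|^2.
\end{aligned} \tag{34}
$$

Substituting the maximum likelihood estimates of the parameters into (31) and taking the $d/2$ root, GLRT is expressed as

$$
\rho = \frac{\|\hat{\mathbf{w}}_0\|^2}{\|\hat{\mathbf{w}}_1\|^2} = \frac{\mathbf{y}^T \mathbf{P}_I \mathbf{y}}{\mathbf{y}^T (\mathbf{P}_I - \mathbf{P}_T) \mathbf{y}} = \frac{\mathbf{y}^T \mathbf{y}}{\mathbf{y}^T (\mathbf{I} - \mathbf{T}\mathbf{T}^T) \mathbf{y}} \tag{35}
$$

where $\mathbf{P}_I = \mathbf{I}$ is the identity projection operator, and $\mathbf{P}_T$ is the projection onto the subspace [T],

$$
\mathbf{P}_T = \mathbf{T}(\mathbf{T}^T \mathbf{T})^{-1} \mathbf{T}^T = \mathbf{T}\mathbf{T}^T. \tag{36}
$$

The detection result is evaluated by comparing $\rho$ of GLRT with a threshold value $T_{glrt}$.

B. Kernel GLRT Based on Matched Subspace Model

Accordingly, if $\mathcal{H}_{0\varphi}, \mathcal{H}_{1\varphi}$ also obey Gaussian distributions,

$$
\begin{aligned}
\mathcal{H}_{0\varphi} &: \varphi(\mathbf{y}) = \mathbf{w}_\varphi \\
\mathcal{H}_{1\varphi} &: \varphi(\mathbf{y}) = \mathbf{T}_\varphi \boldsymbol{\theta}_\varphi + \mathbf{w}_\varphi,
\end{aligned} \tag{37}
$$

then GLRT can be extended to the feature space of $\varphi(\mathbf{y})$,

$$
\rho = \frac{\|\hat{\mathbf{w}}_{0\varphi}\|^2}{\|\hat{\mathbf{w}}_{1\varphi}\|^2} = \frac{\varphi(\mathbf{y})^T \mathbf{P}_{I_\varphi} \varphi(\mathbf{y})}{\varphi(\mathbf{y})^T (\mathbf{P}_{I_\varphi} - \mathbf{P}_{T_\varphi}) \varphi(\mathbf{y})} \tag{38}
$$

where $\mathbf{P}_{I_\varphi}$ is the identity projection operator in the feature space. $[\mathbf{T}_\varphi]$ is the linear space on which the primary user's signal lies in the feature space. Each column of $\mathbf{T}_\varphi$ is an eigenvector corresponding to a nonzero eigenvalue of

$$
\mathbf{R}_{\varphi(x)} = \frac{1}{M} \sum_{i=1}^{M} \varphi(\mathbf{x}_i) \varphi(\mathbf{x}_i)^T. \tag{39}
$$

Likewise, $\mathbf{P}_{T_\varphi}$ is the projection operator onto the primary signal's subspace,

$$
\mathbf{P}_{T_\varphi} = \mathbf{T}_\varphi (\mathbf{T}_\varphi^T \mathbf{T}_\varphi)^{-1} \mathbf{T}_\varphi^T = \mathbf{T}_\varphi \mathbf{T}_\varphi^T. \tag{40}
$$

Here, we assume that $\mathbf{P}_{I_\varphi}$ perfectly projects $\varphi(\mathbf{x})$ to $\varphi(\mathbf{x})$ in the feature space, which is different from the method proposed in [21],

$$
\varphi(\mathbf{y})^T \mathbf{P}_{I_\varphi} \varphi(\mathbf{y}) = \varphi(\mathbf{y})^T \varphi(\mathbf{y}). \tag{41}
$$

Based on the derivation of kernel PCA, the eigenvectors corresponding to nonzero eigenvalues of the sample covariance matrix $\mathbf{R}_{\varphi(x)}$ are $(\varphi(\mathbf{x}_1), \varphi(\mathbf{x}_2), \ldots, \varphi(\mathbf{x}_M))(\boldsymbol{\beta}_1, \boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_K)$. $\boldsymbol{\beta}_1, \boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_K$ are the eigenvectors corresponding to nonzero eigenvalues of $\mathbf{K} = (k(\mathbf{x}_i, \mathbf{x}_j))_{ij}$. $K$ is the number of nonzero eigenvalues of $\mathbf{K}$. Accordingly, $\varphi(\mathbf{y})^T \mathbf{P}_{T_\varphi} \varphi(\mathbf{y})$ can be represented as

$$
\begin{aligned}
\varphi(\mathbf{y})^T \mathbf{T}_\varphi \mathbf{T}_\varphi^T \varphi(\mathbf{y})
&= \varphi(\mathbf{y})^T (\varphi(\mathbf{x}_1), \varphi(\mathbf{x}_2), \ldots, \varphi(\mathbf{x}_M)) (\boldsymbol{\beta}_1, \boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_K) (\boldsymbol{\beta}_1, \boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_K)^T \begin{pmatrix} \varphi(\mathbf{x}_1)^T \\ \vdots \\ \varphi(\mathbf{x}_M)^T \end{pmatrix} \varphi(\mathbf{y}) \\
&= (k(\mathbf{y}, \mathbf{x}_1), k(\mathbf{y}, \mathbf{x}_2), \ldots, k(\mathbf{y}, \mathbf{x}_M)) (\boldsymbol{\beta}_1, \boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_K) (\boldsymbol{\beta}_1, \boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_K)^T \begin{pmatrix} k(\mathbf{y}, \mathbf{x}_1) \\ k(\mathbf{y}, \mathbf{x}_2) \\ \vdots \\ k(\mathbf{y}, \mathbf{x}_M) \end{pmatrix}.
\end{aligned} \tag{42}
$$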

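The derivation of (38) is based on the assumption that the hypotheses $\mathcal{H}_{0\varphi}, \mathcal{H}_{1\varphi}$ obey Gaussian distributions. The paper [21] has claimed, though without strict proof, that if $k$ is a Gaussian kernel, $\mathcal{H}_{0\varphi}, \mathcal{H}_{1\varphi}$ are still Gaussian distributed.

A Gaussian kernel is employed for the kernel GLRT approach, thus $\varphi(\mathbf{y})^T \varphi(\mathbf{y}) = k(\mathbf{y}, \mathbf{y}) = 1$. Substituting (42) into (38),

$$
\rho = \frac{1}{1 - \mathbf{k}_x^T (\boldsymbol{\beta}_1, \boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_K) \begin{pmatrix} \boldsymbol{\beta}_1^T \\ \boldsymbol{\beta}_2^T \\ \vdots \\ \boldsymbol{\beta}_K^T \end{pmatrix} \mathbf{k}_x} \tag{43}
$$

in which

$$
\mathbf{k}_x = \begin{pmatrix} k(\mathbf{y}, \mathbf{x}_1) \\ k(\mathbf{y}, \mathbf{x}_2) \\ \vdots \\ k(\mathbf{y}, \mathbf{x}_M) \end{pmatrix}. \tag{44}
$$

The centering of $\mathbf{k}_x$ in the feature space [21] is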



kx = kn 



i 



i i 

M' M'' 



1 

M 



(45) 



The procedure of kernel GLRT for spectrum sensing based on Gaussian kernels without consideration of centering is summarized as follows:

1) Given a training set of the primary user's signal $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M$, the kernel matrix is $\mathbf{K} = (k(\mathbf{x}_i, \mathbf{x}_j))_{ij}$. $\mathbf{K}$ is positive semidefinite. Perform eigen-decomposition of $\mathbf{K}$ to obtain the eigenvectors $\boldsymbol{\beta}_1, \boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_K$ corresponding to all of the nonzero eigenvalues.

2) Normalize the received $d$-dimensional vector $\mathbf{y}$ by

$$
\mathbf{y} = \frac{\mathbf{y}}{\|\mathbf{y}\|_2}. \tag{46}
$$

3) Compute the kernel vector $\mathbf{k}_x$ by (44).

4) Compute the value of $\rho$ defined in (43).

5) Determine a threshold value $T_{kglrt}$ for a desired false alarm rate.

6) Detect the presence or absence of $\mathbf{x}$ in $\mathbf{y}$ by checking whether $\rho > T_{kglrt}$ or not.
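Steps 1) to 4) above can be sketched as below. Several details are assumptions of this sketch rather than statements of the paper: eigenvalues are treated as nonzero when they exceed a relative tolerance `tol`, each eigenvector is scaled as in (16) so that the implicit columns of the subspace basis have unit norm, and the training vectors in the toy example are chosen with unit norm. The function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def kernel_glrt_statistic(X, y, sigma=1.0, tol=1e-10):
    """rho of Eq. (43) for one received vector y and training rows X."""
    y = y / np.linalg.norm(y)                      # step 2, Eq. (46)
    mu, B = np.linalg.eigh(gaussian_kernel(X, X, sigma))
    keep = mu > tol * mu.max()                     # numerically nonzero eigenvalues
    B = B[:, keep] / np.sqrt(mu[keep])             # beta_k^T K beta_k = 1
    k_x = gaussian_kernel(X, y[None, :], sigma)[:, 0]   # step 3, Eq. (44)
    projected = np.sum((B.T @ k_x) ** 2)           # k_x^T B B^T k_x
    return float(1.0 / (1.0 - projected))          # step 4, Eq. (43)


# Toy training set: three unit vectors close to the first coordinate axis.
e = np.eye(4)
X = np.stack([(e[0] + 0.1 * e[k]) / np.sqrt(1.01) for k in (1, 2, 3)])
rho_near = kernel_glrt_statistic(X, e[0], sigma=0.2)    # close to the training set
rho_far = kernel_glrt_statistic(X, -e[0], sigma=0.2)    # far from the training set
print(rho_near, rho_far)
```

A received vector that coincides with a training vector makes the denominator of (43) vanish when the kernel matrix has full rank, so a practical implementation would need to guard against division by a value near zero.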



Fig. 2. The flow chart of the proposed kernel GLRT algorithm for spectrum sensing

The flow chart of the proposed kernel GLRT algorithm for spectrum sensing with Gaussian kernels is shown in Fig. 2.

The detection rate and false alarm rate for all of the above methods can be calculated by

$$
\begin{aligned}
P_d &= \mathrm{prob}(\rho > T \mid \mathbf{y} = \mathbf{x} + \mathbf{w}) \\
P_f &= \mathrm{prob}(\rho > T \mid \mathbf{y} = \mathbf{w})
\end{aligned} \tag{47}
$$

where $T$ is the threshold value determined by each of the above algorithms. In general, the threshold value is determined by a false alarm rate of 10%.
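One common way to realize (47) in simulation is to collect the test statistic over many noise-only trials, take the threshold as an empirical quantile, and then count exceedances over signal-plus-noise trials. The sketch below assumes this Monte Carlo procedure; the paper does not spell out how the thresholds are computed, and the function names are hypothetical.

```python
import numpy as np


def threshold_for_false_alarm(stats_h0, pf=0.10):
    """Threshold T such that roughly a fraction pf of noise-only statistics exceed it."""
    return float(np.quantile(stats_h0, 1.0 - pf))


def exceed_rate(stats, threshold):
    """Fraction of statistics above the threshold (P_f under H0, P_d under H1)."""
    return float(np.mean(np.asarray(stats) > threshold))


# Stand-in statistics; in an experiment these come from repeated trials.
stats_h0 = np.arange(100.0)          # statistics under y = w
stats_h1 = np.arange(50.0, 150.0)    # statistics under y = x + w
T = threshold_for_false_alarm(stats_h0, pf=0.10)
print(T, exceed_rate(stats_h0, T), exceed_rate(stats_h1, T))
```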

IV. Experiments 

The experimental results are compared with the results of the estimator-correlator (EC) [26] and maximum minimum eigenvalue (MME) [7] methods. The EC method assumes that the signal $\mathbf{x}$ follows a zero mean Gaussian distribution with covariance matrix $\boldsymbol{\Sigma}_x$,

$$
\mathbf{x} : N(\mathbf{0}, \boldsymbol{\Sigma}_x), \quad \mathbf{w} : N(\mathbf{0}, \sigma^2 \mathbf{I}). \tag{48}
$$

Both $\boldsymbol{\Sigma}_x$ and $\sigma^2$ are given a priori. Consequently, when the signal $\mathbf{x}$ obeys a Gaussian distribution, the EC method is optimal. The hypothesis is $\mathcal{H}_1$ when

$$
\rho = \mathbf{y}^T \boldsymbol{\Sigma}_x (\boldsymbol{\Sigma}_x + \sigma^2 \mathbf{I})^{-1} \mathbf{y} > T_{ec}, \tag{49}
$$

where $T_{ec}$ is the threshold value designed for the EC method.

MME is a totally blind method without any prior knowledge of the covariance matrix of the signal or of $\sigma^2$. The hypothesis is $\mathcal{H}_1$ when

$$
\rho = \frac{\lambda_{max}}{\lambda_{min}} > T_{mme}, \tag{50}
$$

where $T_{mme}$ is the threshold value designed for the MME method. $\lambda_{max}$ and $\lambda_{min}$ are the maximal and minimal eigenvalues of the sample covariance matrix $\mathbf{R}_y = \frac{1}{M} \sum_{i=1}^{M} \mathbf{y}_i \mathbf{y}_i^T$.

The PCA, kernel PCA, GLRT, and kernel GLRT methods considered in this paper rely on partial prior knowledge, that is, the sample covariance matrix of the signal is given a priori.
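The two reference statistics in (49) and (50) can be sketched in a few lines. The function names are hypothetical, received vectors are stored as rows, and the toy inputs are chosen so that the results are easy to verify by hand.

```python
import numpy as np


def ec_statistic(y, Sigma_x, sigma2):
    """Estimator-correlator statistic y^T Sigma_x (Sigma_x + sigma^2 I)^{-1} y, Eq. (49)."""
    d = y.size
    # Solve a linear system instead of forming the matrix inverse explicitly.
    return float(y @ Sigma_x @ np.linalg.solve(Sigma_x + sigma2 * np.eye(d), y))


def mme_statistic(Y):
    """Ratio of the largest to the smallest eigenvalue of R_y, Eq. (50)."""
    R = Y.T @ Y / Y.shape[0]
    lam = np.linalg.eigvalsh(R)      # eigenvalues in ascending order
    return float(lam[-1] / lam[0])


# EC with Sigma_x = I and sigma^2 = 1 reduces to ||y||^2 / 2.
print(ec_statistic(np.array([3.0, 4.0]), np.eye(2), 1.0))

# Four received vectors whose sample covariance is diag(2, 0.5).
Y = np.array([[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0], [0.0, -1.0]])
print(mme_statistic(Y))
```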







A. Experiments on the Simulated Sinusoidal Signal 

The primary user's signal is assumed to be the sum of three sinusoidal functions, each with unit amplitude. The generated sinusoidal samples with length $L = 500$ are taken as the samples of $x(n)$. The training set $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M$ is taken from $x(n)$ with $d = 128$ and $i = 1$. The received signal $y(n)$ has the same length as $x(n)$. The vectorized $y(n)$ are $\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M$ with $d = 128$ and $i = 1$. For the received vectors $\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M$, EC detection is implemented on every vector and the results are then averaged (the same implementation is used for GLRT and kernel GLRT),

$$
\rho = \frac{1}{M} \sum_{i=1}^{M} \mathbf{y}_i^T \boldsymbol{\Sigma}_x (\boldsymbol{\Sigma}_x + \sigma^2 \mathbf{I})^{-1} \mathbf{y}_i. \tag{51}
$$

A polynomial kernel of order 2 with $c = 1$ is applied for kernel PCA.

The detection rates versus SNR for kernel PCA and PCA, compared with EC and MME with $P_f = 10\%$, are shown in Fig. 3 for 1000 experiments. From Fig. 3 it can be seen that when SNR < -10 dB, kernel PCA is about 4 dB better than the PCA method. Kernel PCA can compete with the EC method while using less prior knowledge. It should be noted that both the type of kernel function and the parameters in the kernel function affect the performance of the kernel PCA approach.

The detection rates versus SNR for kernel GLRT and GLRT, compared with EC and MME with $P_f = 10\%$, are shown in Fig. 4 for 1000 experiments. Kernel GLRT is still better than the GLRT method. Kernel GLRT can even beat the EC method. The underlying reason is that the EC method assumes that the sinusoidal signal follows a zero-mean Gaussian distribution, whereas its actual distribution is the one shown in Fig. 12. It is well known that a sinusoidal signal lies on a linear subspace which can be estimated nearly perfectly from the sample covariance matrix. Therefore, the matched subspace model for GLRT and kernel GLRT considered in this paper is more suitable for the sinusoidal signal. A Gaussian kernel is used. The width $\sigma$ of the Gaussian kernel is the major factor that affects the performance of the kernel GLRT approach.

The calculated threshold values with $P_f = 10\%$ for kernel PCA, PCA, kernel GLRT, and GLRT methods are shown in Fig. 5 and Fig. 6, respectively. The threshold values are normalized by dividing by the corresponding maximal values in $T_{pca}$, $T_{kpca}$, $T_{glrt}$, and $T_{kglrt}$, respectively. The threshold values assigned for the kernel methods are more stable than those of the corresponding linear methods.

Fig. 3. The detection rates for kernel PCA and PCA compared with EC and MME with $P_f = 10\%$ for the simulated signal

Fig. 4. The detection rates for kernel GLRT and GLRT compared with EC and MME with $P_f = 10\%$ for the simulated signal

Fig. 5. Normalized threshold values for kernel PCA and PCA

The simulation is verified by choosing the kernel function as $k(\mathbf{x}_i, \mathbf{x}_j) = \langle \mathbf{x}_i, \mathbf{x}_j \rangle$. In this manner, the selected feature space is the original space. If the operations in the feature space and the original space are identical (for example, the centering is done in both spaces, and the similarity measure is the inner-product for both PCA and kernel PCA), the results for the kernel methods and the corresponding linear methods should be the same. The tested results verified the correctness of the simulation.

Fig. 6. Normalized threshold values for kernel GLRT and GLRT

Fig. 7. Similarities of leading eigenvectors derived by PCA and kernel PCA between the first segment and other 199 segments
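The linear-kernel check described in this subsection can be reproduced on toy data. The sketch below assumes no centering in either space and compares the leading eigenvector of the sample covariance matrix with the vector $\sum_i \beta_i \mathbf{x}_i$ recovered from the kernel matrix; the function names are hypothetical.

```python
import numpy as np


def pca_leading(X):
    """Leading eigenvector of R_x = (1/M) X^T X (rows of X are the vectors)."""
    R = X.T @ X / X.shape[0]
    return np.linalg.eigh(R)[1][:, -1]


def kernel_pca_leading_linear(X):
    """Leading feature-space eigenvector for the linear kernel k(x, z) = <x, z>.

    With the linear kernel the feature map is the identity, so
    v = sum_i beta_i x_i can be formed explicitly.
    """
    K = X @ X.T
    mu, B = np.linalg.eigh(K)
    beta = B[:, -1] / np.sqrt(mu[-1])    # normalization of Eq. (16)
    return X.T @ beta


X = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 0.5]])
v_pca = pca_leading(X)
v_kpca = kernel_pca_leading_linear(X)
print(abs(v_pca @ v_kpca))   # agreement up to the sign of the eigenvector
```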



Fig. 8. The detection rates for kernel PCA and PCA compared with EC and 
MME with P f = 10% for DTV signal 




Fig. 9. The detection rates for kernel GLRT and GLRT compared with EC 
and MME with P f = 10% for DTV signal 



B. Experiments on Captured DTV Signal 

A DTV signal [23] captured in Washington D.C. is employed for the spectrum sensing experiments in this section. The first segment of the DTV signal with $L = 500$ is taken as the samples of the primary user's signal $x(n)$.

First, the similarities of the leading eigenvectors of the sample covariance matrix between the first segment and the other segments of the DTV signal are tested under the frameworks of PCA and kernel PCA. A DTV signal of length $10^5$ is obtained and divided into 200 segments, each of length 500. Similarities of leading eigenvectors derived by PCA and kernel PCA between the first segment and the remaining 199 segments are shown in Fig. 7. The result shows that the similarities between the leading eigenvectors of different segments of the DTV signal are very high (all above 0.94), and that kernel PCA is more stable than PCA.

The detection rates versus SNR for kernel PCA and PCA (kernel GLRT and GLRT), compared with EC and MME with $P_f = 10\%$, are shown in Fig. 8 (Fig. 9) for 1000 experiments. The ROC curves are shown in Fig. 10 (Fig. 11) for kernel PCA and PCA (kernel GLRT and GLRT) with SNR = -16, -20, -24 dB. Experimental results show that kernel methods are 4 dB better than the corresponding linear methods. Kernel methods can compete with the EC method. However, kernel GLRT in this example cannot beat the EC method, because the distribution of the DTV signal (shown in Fig. 12) is closer to Gaussian than that of the simulated sinusoidal signal above. A Gaussian kernel is applied for kernel GLRT. A polynomial kernel of order 2 with $c = 1$ is applied for kernel PCA.

Fig. 10. ROC curves for kernel PCA and PCA for DTV signal
Fig. 11. ROC curves for kernel GLRT and GLRT for DTV signal

Fig. 12. The histograms of sinusoidal and DTV signal

V. Conclusion 

Kernel methods have been extensively and effectively applied in machine learning. A kernel function can extend a linear method to a nonlinear one by defining the inner-product of data in the feature space. The mapping from the original space to a higher dimensional feature space is defined indirectly by the kernel function. Kernel methods make computation in a feature space of arbitrary dimension possible.

In this paper, detection with the leading eigenvector under the framework of kernel PCA is proposed. The inner-product between leading eigenvectors is taken as the similarity measure for the kernel PCA approach. The proposed algorithm makes detection in a feature space of arbitrary dimension possible. Kernel GLRT based on the matched subspace model is also introduced to spectrum sensing. Different from [21], the kernel GLRT approach proposed in this paper assumes that the identity projection operator $\mathbf{P}_{I_\varphi}$ is perfect in the feature space, that is, it maps $\varphi(\mathbf{x})$ to $\varphi(\mathbf{x})$. The background information is not considered in this paper.

Experiments are conducted with both a simulated sinusoidal signal and a captured DTV signal. When the second order polynomial kernel with $c = 1$ is used for the kernel PCA approach, the experimental results show that kernel PCA is 4 dB better than PCA on both the simulated signal and the DTV signal. Kernel PCA can compete with the EC method. The kernel GLRT method is about 4 dB better than GLRT for the DTV signal with an appropriate choice of the width of the Gaussian kernel. Depending on the signal, kernel GLRT can even beat the EC method, which has perfect prior knowledge.

In this paper, the types of kernels and the parameters in the kernels are chosen manually by trial and error. How to choose an appropriate kernel function and parameter is still an open problem. In the PCA and kernel PCA approaches, only the leading eigenvector is used for detection. Can both methods be extended to detection with subspaces that consist of eigenvectors corresponding to nonzero eigenvalues? The kernel PCA approach shows that a suitable choice of similarity measure is very important. What kind of similarity measure can be used for detection with subspaces also seems an interesting and promising future direction.